While you are waiting, have you done the prerequisites on your computer?
Install the latest version of R
Install the latest version of RStudio
Install the packages
#install.packages("tidyverse")
#install.packages("readxl")
**If you are having trouble, let one of us know.**
The workshop has 4 parts, with challenges to practice your knowledge. Ask lots of questions!
We will be live coding, so follow along! The best way to learn is to try it yourself on your own computer.
Instead of having your hand up and coding one-handed, we use sticky notes: red means "help!" and green means "good". When we are not teaching we will act as helpers, so put up a red sticky if you need our help.
For this workshop, we will be using R via RStudio.
You can think of R like a car’s engine, while RStudio is like a car’s dashboard.
ModernDive, Figure 1.1.
So what this means is that, just as we don’t drive a car by interacting directly with the engine but rather by interacting with the car’s dashboard, we won’t be using R directly.
Instead, we will be using RStudio’s interface.
ModernDive, Figure 1.2.
After you open RStudio, you should see the following 3 panels:
## Code basics
- typing into the console then pressing enter, R will compute immediately
- typing into an R script will allow you to re-run your code over and over (reproducible)
- workflow is usually: cmd + enter (Mac) or control + enter (Windows) to run the current line or selection of your script
1 + 2 # do basic math
## [1] 3
print("hello") # displays text - sends it into the console
## [1] "hello"
The # symbol starts a comment, which is not read by R. Use this to take notes during the workshop.
a <- 1 + 2 * 3 / 4 ^ 5
b <- "hello"
a <- 3
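To see what the first assignment above actually computes, here is a small sketch that breaks the expression into steps. The names `step1`, `step2` and `by_hand` are made up for illustration. R evaluates `^` first, then `*` and `/` from left to right, then `+`.

```r
# Order of operations in: 1 + 2 * 3 / 4 ^ 5
step1 <- 4 ^ 5           # exponent first
step2 <- 2 * 3 / step1   # then multiplication and division, left to right
by_hand <- 1 + step2     # addition last

a <- 1 + 2 * 3 / 4 ^ 5
identical(a, by_hand)    # same value either way

# Assigning to the same name again overwrites the old value
a <- 3
print(a)
```

After the last line, `a` only holds 3. The earlier result is gone unless you saved it under another name.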
RStudio will highlight these (syntax highlighting). These are only some commonly used types; you will encounter others later in this workshop.
"abc" # Character (always in quotations)
## [1] "abc"
42 # Numeric
## [1] 42
3.14 # Double
## [1] 3.14
TRUE && FALSE # Logical
## [1] FALSE
NA # Missing value ("not available"); not the same as NULL
## [1] NA
#check your type by using a function
str(a)
## num 3
Know your types! It will be important when troubleshooting later (some things you can do with some types but not with others).
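As a quick illustration of why types matter, the sketch below checks a few values with `class()` and `typeof()` and then shows one thing you can do with a number but not with a character. The variable names `x` and `y` are only for this example.

```r
# Ask R what type a value is
class("abc")   # "character"
class(42)      # "numeric"
typeof(42)     # "double"  (numbers are stored as doubles by default)
typeof(42L)    # "integer" (the L suffix asks for an integer)
class(TRUE)    # "logical"
is.na(NA)      # TRUE

# A number stored as text cannot be used in arithmetic:
x <- "1"
# x + 1        # this line would stop with an error, because x is a character

# Convert it to a number first, then the arithmetic works
y <- as.numeric(x) + 1
print(y)
```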
Base R comes with some built-in functions, like str():
sum(1,2,3) #adds all the numbers
## [1] 6
mean(1,2,3) # careful: only the first number is used as the data
## [1] 1
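The result of 1 above is a common surprise. Unlike `sum()`, `mean()` takes its numbers as a single vector in its first argument, so the 2 and 3 are matched to other arguments (`trim` and `na.rm`) instead of being averaged. A minimal sketch of the fix is to combine the numbers with `c()` first:

```r
# Combine the numbers into one vector, then take the mean
mean(c(1, 2, 3))   # 2

# Without c(), only the first value is treated as the data
mean(1, 2, 3)      # 1

# sum() accepts the numbers either way
sum(1, 2, 3)       # 6
sum(c(1, 2, 3))    # 6
```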
help(str)
Try looking at the help documentation for paste. Read how it works and try using it. What does it do?
Other places to get help:
R for Data Science book: https://r4ds.had.co.nz/
Google & Stack Overflow, e.g. search for "combine two characters in R"
But what if you want to do more?
R packages extend the functionality of R by providing additional functions, data and documentation
ModernDive, Figure 1.4.
So let’s continue with this analogy: Let’s say you’ve purchased a new phone (brand new R/RStudio install) and you want to take a photo (do some data analysis) and share it with your friends and family. So you need to:
This process is very similar when you are using an R package. You need to:
install.packages("tidyverse")
library(tidyverse)
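Installing is done once per computer, while `library()` is needed in every new R session. If you are not sure whether a package is already installed, something like the sketch below can check before installing. The names `pkgs`, `installed` and `missing_pkgs` are made up for this example, and the install line is left commented out so nothing is downloaded by accident.

```r
# Packages used in this workshop
pkgs <- c("tidyverse", "readxl")

# requireNamespace() returns TRUE if the package can be found, FALSE otherwise
installed <- vapply(pkgs, requireNamespace, logical(1), quietly = TRUE)
print(installed)

# Keep only the names of the packages that were not found
missing_pkgs <- pkgs[!installed]

# Uncomment to install whatever is missing (needs an internet connection)
# install.packages(missing_pkgs)
```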
See ModernDive Chapter 1 for further reading.
One day you will need to quit R, go do something else and return to your analysis later.
One day you will be running multiple analyses in R and you want to keep them separate.
One day you will need to bring data from the outside world into R and present results and figures from R back out to the world.
So how do you know which parts of your analysis are “real” and where does your analysis “live”?
The working directory is where R will look, by default, for files you ask it to load or to save.
You can explicitly check your working directory with:
getwd()
## [1] "/Users/Jasmine/tmp/introRworkshop_Sept19/lessons"
It is also displayed at the top of the RStudio console
Figure 1.5. Find my path
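A small sketch of what "where R will look" means in practice. The file name `example.csv` is hypothetical and does not need to exist. The point is only to show how a bare file name is resolved against the working directory.

```r
# Where am I?
wd <- getwd()
print(wd)

# What files and folders does R see here?
list.files(wd)

# A bare file name like "example.csv" is looked up inside the working directory,
# so these two refer to the same place (example.csv is a hypothetical file)
file.path(wd, "example.csv")
file.exists("example.csv")
```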
DO NOT USE setwd unless you want Jenny Bryan to set your computer on fire!
Figure 1.6. Don’t setwd()
Figure 1.7. Don’t setwd()
So what’s wrong with:
setwd("/Users/amy/fuzzy_alpaca/cute_animals/foofy/data")
df <- read.delim("raw_foofy_data.csv")
p <- ggplot(df, aes(x, y)) + geom_point()
ggsave("../figs/foofy_scatterplot.png")
The chance of the setwd() command having the desired effect - making the file paths work - for anyone besides its author is 0%. It might not even work for the author a year or two from now. So essentially your data analysis project is not self-contained and portable, which makes recreating the plot impossible.
Read more here: https://www.tidyverse.org/articles/2017/12/workflow-vs-script/
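One possible alternative, sketched here under the assumption that R was started from the top folder of the project (for example by opening it as an RStudio Project) and that the project has `data` and `figs` folders as in the script above: build paths relative to the project instead of calling setwd(). The variable names `data_path` and `fig_path` are made up for this example.

```r
# Paths relative to the project folder; no setwd() and no user-specific folders
data_path <- file.path("data", "raw_foofy_data.csv")
fig_path  <- file.path("figs", "foofy_scatterplot.png")
print(data_path)

# Only try to read the file if it is really there
if (file.exists(data_path)) {
  df <- read.csv(data_path)
  str(df)
} else {
  message("No file at ", data_path, " (looked inside ", getwd(), ")")
}
```

Because the paths never mention a particular user's home folder, the same script can run on someone else's computer as long as they open the same project folder.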
Typically, I organize each data analysis into a project using an RStudio Project. I tend to have a directory each for: